 review spam


A Unified Model for Unsupervised Opinion Spamming Detection Incorporating Text Generality

Xu, Yinqing (The Chinese University of Hong Kong) | Shi, Bei (The Chinese University of Hong Kong) | Tian, Wentao (The Chinese University of Hong Kong) | Lam, Wai (The Chinese University of Hong Kong)

AAAI Conferences

Many existing methods on review spam detection considering text content merely utilize simple text features such as content similarity. We explore a novel idea of exploiting text generality for improving spam detection. Besides, apart from the task of review spam detection, although there have also been some works on identifying the review spammers (users) and the manipulated offerings (items), no previous works have attempted to solve these three tasks in a unified model. We have proposed a unified probabilistic graphical model to detect the suspicious review spams, the review spammers and the manipulated offerings in an unsupervised manner.

Unlike other forms of spamming, it is difficult to collect a large amount of gold-standard labels for reviews by means of manual effort. Thus, most of these methods [Mukherjee et al., 2013; Li et al., 2013a; Sun et al., 2013] just rely on the ad-hoc or pseudo fake or non-fake labels for model training, such as the labels annotated by the Amazon anonymous online workers [Ott et al., 2011; Li et al., 2014]. On the other hand, some unsupervised methods have been proposed to detect the individual review spammer [Mukherjee et al., 2013; Lim et al., 2010; Wang et al., 2011] and review spammer groups [Mukherjee et al., 2012]. In addition, time series pattern [Xie et al., 2012], rating distribution [Feng et al., 2012], reviewer graph [Wang et al., 2011], and reviewing burstiness [Fei et al., 2013] have also been applied to identify the review spams in an unsupervised ...
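For readers unfamiliar with the "content similarity" feature mentioned in the abstract above, the following is a minimal sketch of one common way to compute it: cosine similarity over bag-of-words counts, with each review scored by its closest match among the other reviews. It does not implement the paper's text generality idea or its graphical model. The function names, the tokenizer, and the sample reviews are assumptions made for illustration only.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    # Assumed tokenizer: lowercase runs of letters, digits and apostrophes.
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(text_a, text_b):
    # Cosine of the angle between the two word-count vectors.
    va, vb = bag_of_words(text_a), bag_of_words(text_b)
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def max_similarity_to_others(reviews):
    # For each review, the highest similarity to any other review.
    # Near-duplicate reviews get a score close to 1.
    return [
        max(
            (cosine_similarity(r, other) for j, other in enumerate(reviews) if j != i),
            default=0.0,
        )
        for i, r in enumerate(reviews)
    ]


# Hypothetical sample data, not taken from any corpus.
sample_reviews = [
    "Best phone ever, buy it now",
    "Best phone ever, buy it now!!",
    "Battery died after two weeks of light use",
]
scores = max_similarity_to_others(sample_reviews)
print(scores)
```

A feature like this only flags near-duplicate text, which is the limitation the abstract points to when it calls such features simple.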


Learning to Identify Review Spam

Li, Fangtao (Tsinghua University) | Huang, Minlie (Tsinghua University) | Yang, Yi (Tsinghua University) | Zhu, Xiaoyan (Tsinghua University)

AAAI Conferences

In the past few years, sentiment analysis and opinion mining have become popular and important tasks. These studies all assume that their opinion resources are real and trustworthy. However, they may encounter the fake opinion, or opinion spam, problem. In this paper, we study this issue in the context of our product review mining system. On product review sites, people may write fake reviews, called review spam, to promote their products or defame their competitors' products. It is important to identify and filter out the review spam. Previous work focuses only on heuristic rules, such as helpfulness voting or rating deviation, which limits the performance of this task. In this paper, we exploit machine learning methods to identify review spam. Toward this end, we manually build a spam collection from our crawled reviews. We first analyze the effect of various features in spam identification. We also observe that a review spammer consistently writes spam. This provides us with another view for identifying review spam: we can identify whether the author of the review is a spammer. Based on this observation, we provide a two-view semi-supervised method, co-training, to exploit the large amount of unlabeled data. The experimental results show that our proposed method is effective. Our designed machine learning methods achieve significant improvements in comparison to the heuristic baselines.
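The two-view co-training idea described in the abstract above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the nearest-centroid classifier, the feature values, and the names `NearestCentroid`, `co_train`, `review_view`, and `reviewer_view` are all assumptions made for the example. Each view trains its own classifier on the current labeled pool, and the most confident predictions from either view are added as pseudo-labels for the next round.

```python
from statistics import mean


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


class NearestCentroid:
    """Toy two-class classifier: predicts the class with the closest centroid."""

    def fit(self, rows, labels):
        self.centroids = {}
        for cls in set(labels):
            members = [r for r, l in zip(rows, labels) if l == cls]
            self.centroids[cls] = [mean(col) for col in zip(*members)]
        return self

    def predict(self, row):
        # Returns (label, margin); the margin is the distance gap between
        # the two nearest centroids and serves as a crude confidence score.
        ranked = sorted(
            (squared_distance(row, c), cls) for cls, c in self.centroids.items()
        )
        (best_dist, best_cls), (next_dist, _) = ranked[0], ranked[1]
        return best_cls, next_dist - best_dist


def co_train(view_a, view_b, seed_labels, rounds=5, per_round=2):
    """seed_labels maps row index -> 0/1 and must contain both classes."""
    labels = dict(seed_labels)
    for _ in range(rounds):
        unlabeled = [i for i in range(len(view_a)) if i not in labels]
        if not unlabeled:
            break
        idx = list(labels)
        y = [labels[i] for i in idx]
        newly_labeled = {}
        for view in (view_a, view_b):
            clf = NearestCentroid().fit([view[i] for i in idx], y)
            scored = [(clf.predict(view[i]), i) for i in unlabeled]
            scored.sort(key=lambda item: item[0][1], reverse=True)
            # Each view contributes its most confident pseudo-labels.
            for (pred, _margin), i in scored[:per_round]:
                newly_labeled.setdefault(i, pred)
        labels.update(newly_labeled)
    return labels


# Hypothetical features: the review view holds [content similarity, rating
# deviation]; the reviewer view holds [share of the author's reviews that
# look duplicated]. Label 1 = spam, 0 = non-spam.
review_view = [[0.9, 2.0], [0.1, 0.2], [0.8, 1.8], [0.2, 0.1], [0.85, 1.9], [0.15, 0.3]]
reviewer_view = [[0.9], [0.1], [0.8], [0.2], [0.95], [0.05]]
seed = {0: 1, 1: 0}

result = co_train(review_view, reviewer_view, seed)
print(result)
```

In this toy data the two views agree cleanly, so two seed labels are enough to label every row. Real review data is noisier, and pseudo-labels from one view can mislead the other, which is why the number of examples added per round is kept small.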